
Detecting Events and Patterns in Large-Scale User Generated Textual Streams with Statistical Learning Methods

Abstract

A vast amount of textual web streams is influenced by events or phenomena emerging in the real world. The social web forms an excellent modern paradigm, where unstructured user generated content is published on a regular basis and in most occasions is freely distributed. The present Ph.D. Thesis deals with the problem of inferring information - or patterns in general - about events emerging in real life based on the contents of this textual stream. We show that it is possible to extract valuable information about social phenomena, such as an epidemic or even rainfall rates, by automatic analysis of the content published in Social Media, and in particular Twitter, using Statistical Machine Learning methods. An important intermediate task regards the formation and identification of features which characterise a target event; we select and use those textual features in several linear, non-linear and hybrid inference approaches achieving a significantly good performance in terms of the applied loss function. By examining further this rich data set, we also propose methods for extracting various types of mood signals revealing how affective norms - at least within the social web's population - evolve during the day and how significant events emerging in the real world are influencing them. Lastly, we present some preliminary findings showing several spatiotemporal characteristics of this textual information as well as the potential of using it to tackle tasks such as the prediction of voting intentions.

Bibliographic details

  • Author

    Lampos, Vasileios;

  • Author affiliation
  • Year 2012
  • Total pages
  • Original format PDF
  • Language English
  • Chinese Library Classification
